Pentium® Pro Processor Design for Test and Debug
Authors
Abstract
[1] Its microarchitecture forms the basis of Intel's future high-volume microprocessor portfolio. The initial design has also been augmented with enhancements such as MMX technology and compacted onto newer fabrication processes to create the Pentium II processor product line. [2]
The original Pentium Pro processor design, known initially as the P6, introduced several performance features, including register and flag renaming; speculative and out-of-order dispatch and execution of instructions; reordered memory access; multiple branch prediction; a full-speed second-level (L2) cache accessed through a dedicated 72-bit backside bus; and a 64-bit, transaction-oriented, pipelined front-side bus, which operates at GTL voltage levels and directly supports four-way multiprocessing systems.
The design was first implemented in Intel's proprietary 0.6-µm, four-metal BiCMOS process. The CPU die measured 683 mils (17.3 mm) on a side and contained 5.5 million transistors. Designers created two versions of the shipped product by combining the CPU with either a 15.5-million-transistor, 256-Kbyte L2 cache or a 31-million-transistor, 512-Kbyte L2 cache. In either case, the CPU and L2 cache dies were mounted in a 387-pin, dual-cavity, ceramic PGA package, with wire-bonded connections to the pad rings.
The challenges associated with production test and debug on the processor were considerable, due to its tight release schedule, the enormous complexity of its circuit design and microarchitecture, and the complexity of the system platform. In addition, Intel's business requirement of simultaneously meeting very high production, performance, and test quality targets strongly influenced its design-for-test (DFT) direction. This set of constraints limits the ability of Intel microprocessor design teams to use the DFT and test generation techniques (full or partial scan and scan-based BIST) that are commonly used in other microprocessors in the industry. [3-7]
Within these constraints, the design team optimized the design for low die area, high performance, low power dissipation, high test quality, and low test cost.
Die area. Raw fab output is a function of wafer capacity and product die size. Beyond the impact of die area on yield, a larger design requires more fabs to ship a fixed quantity of product. An independent study has estimated that a die area increase of 15% in one particular Pentium processor design would have cost Intel the construction of a new multibillion-dollar fab. [8] For this reason, die area plays an overwhelming role in the economics of high-… The need …
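The dependence of raw fab output on die size described above can be made concrete with a rough sketch. The formula below is a common textbook approximation for gross dies per wafer and is not taken from the paper. The 200-mm wafer diameter is an assumption, and the 15% growth figure (which the abstract cites for a different Pentium design) is applied to the 17.3-mm Pentium Pro die purely for illustration. Yield, scribe lanes, and edge exclusion are ignored.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate count of whole dies on a round wafer.

    Textbook approximation: wafer area divided by die area, minus a
    correction for partial dies lost around the wafer edge.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


# Assumption: 200-mm wafers (the abstract does not state the wafer size).
WAFER_DIAMETER_MM = 200.0

# From the abstract: the CPU die measured 17.3 mm on a side.
base_area = 17.3 ** 2
# Illustrative only: the same die grown by 15% in area.
larger_area = base_area * 1.15

base_dies = gross_dies_per_wafer(WAFER_DIAMETER_MM, base_area)
larger_dies = gross_dies_per_wafer(WAFER_DIAMETER_MM, larger_area)

# Relative number of wafer starts needed to ship the same unit volume.
wafer_ratio = base_dies / larger_dies

print(f"baseline die:   {base_dies} gross dies per wafer")
print(f"15% larger die: {larger_dies} gross dies per wafer")
print(f"wafer starts needed, relative to baseline: {wafer_ratio:.2f}x")
```

Under these assumptions the larger die needs proportionally more wafer starts for the same unit volume, before any yield loss is counted, which is the mechanism behind the fab-capacity argument in the abstract.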
Similar resources
Full Hold-Scan Systems in Microprocessors: Cost/Benefit Analysis
Ever-shrinking microprocessor product development times require enhanced High-Volume Manufacturing (HVM) techniques. This paper describes the full hold-scan testing system implemented in the 90nm Intel Pentium 4 processor. Benefits of this scan system include significantly reduced functional test-writing and fault-grade effort, extensive initialization of the design for test and debug, massive v...
Performance Characterization of the Pentium(r) Pro Processor
In this paper, we characterize the performance of several business and technical benchmarks on a Pentium Pro processor based system. Various architectural data are collected using a performance monitoring counter tool. Results show that the Pentium Pro processor achieves significantly lower cycles per instruction than the Pentium processor due to its out of order and speculative execution, and ...
Pentium III Processor Implementation Tradeoffs
This paper discusses the implementation tradeoffs of the Pentium III processor. The Pentium III processor implements a new extension of the IA-32 instruction set called the Internet Streaming Single-Instruction, Multiple-Data (SIMD) Extensions (Internet SSE). The processor is based on the Pentium Pro processor microarchitecture. The initial development goals for the Pentium III processor were ...
Performance Characterization of a Quad Pentium Pro SMP Using OLTP Workloads
Commercial applications are an important, yet often overlooked, workload with significantly different characteristics from technical workloads. The potential impact of these differences is that computers optimized for technical workloads may not provide good performance for commercial applications, and these applications may not fully exploit advances in processor design. To evaluate these issu...
CA-BIST for Asynchronous Circuits: A Case Study on the RAPPID Asynchronous Instruction Length Decoder
This paper presents a case study in low-cost noninvasive Built-In Self Test (BIST) for RAPPID, a large-scale 120,000-transistor asynchronous version of the Pentium® Pro Instruction Length Decoder, which runs at 3.6 GHz. RAPPID uses a synchronous 0.25 micron CMOS library for static and domino logic, and has no Design-for-Test hooks other than some debug features. We explore the use of Cellular A...